NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

When is Agnostic Reinforcement Learning Statistically Tractable?

Jia, Zeyu; Li, Gene; Rakhlin, Alexander; Sekhari, Ayush; Srebro, Nathan (March 2025, Advances in Neural Information Processing Systems 36)
Oh, A; Naumann, T; Globerson, A; Saenko, K; Hardt, M; Levine, S (Ed.)
We study the problem of agnostic PAC reinforcement learning (RL): given a policy class Pi, how many rounds of interaction with an unknown MDP (with a potentially large state and action space) are required to learn an epsilon-suboptimal policy with respect to Pi? Towards that end, we introduce a new complexity measure, called the spanning capacity, that depends solely on the set Pi and is independent of the MDP dynamics. With a generative model, we show that the spanning capacity characterizes PAC learnability for every policy class Pi. However, for online RL, the situation is more subtle. We show there exists a policy class Pi with a bounded spanning capacity that requires a superpolynomial number of samples to learn. This reveals a surprising separation for agnostic learnability between generative access and online access models (as well as between deterministic/stochastic MDPs under online access). On the positive side, we identify an additional sunflower structure which in conjunction with bounded spanning capacity enables statistically efficient online RL via a new algorithm called POPLER, which takes inspiration from classical importance sampling methods as well as recent developments for reachable-state identification and policy evaluation in reward-free exploration.
more » « less
Free, publicly-accessible full text available March 30, 2026
Optimistic Rates for Learning from Label Proportions

Li, Gene; Chen, Lin; Javanmard, Adel; Mirrokni, Vahab (July 2024, Proceedings of Thirty Seventh Conference On Learning Theory (COLT), PMLR, 2024.)

Full Text Available
Optimistic Rates for Learning from Label Proportions

Li, Gene; Chen, Lin; Javanmard, Adel; Mirrokni, Vahab (July 2024, Proceedings of Thirty Seventh Conference on Learning Theory, PMLR, 2024.)
Agrawal, Shipra; Roth, Aaron (Ed.)
Full Text Available
A historical sequence deletion in a commonly used Bacillus subtilis chromosome integration vector generates undetected loss-of-function mutations

https://doi.org/10.1099/mic.0.001455

Dierksheide, K Julia; Li, Gene-Wei (April 2024, Microbiology)

Since the 1980s, chromosome-integration vectors have been used as a core method of engineeringBacillus subtilis. One of the most frequently used vector backbones contains chromosomally derived regions that direct homologous recombination into theamyElocus. Here, we report a gap in the homology region inherited from the originalamyEintegration vector, leading to erroneous recombination in a subset of transformants and a loss-of-function mutation in the downstream gene. Internal to the homology arm that spans the 3′ portion ofamyEand the downstream geneldh, an unintentional 227 bp deletion generates two crossover events. The major event yields the intended genotype, but the minor event, occurring in ~10 % of colonies, results in a truncation ofldh, which encodes lactate dehydrogenase. Although both types of colonies test positive foramyEdisruption by starch plating, the potential defect in fermentative metabolism may be left undetected and confound the results of subsequent experiments.
more » « less
Full Text Available
Dueling Optimization with a Monotone Adversary

Blum, Avrim; Gupta, Meghal; Li, Gene; Manoj, Naren Sarayu; Saha, Aadirupa; Yang, Yuanyuan (February 2024, The 35th International Conference on Algorithmic Learning Theory (ALT 2024))

We introduce and study the problem of dueling optimization with a monotone adversary, a generalization of (noiseless) dueling convex optimization. The goal is to design an online algorithm to find a minimizer x* for a function f:X→R, for X \subseteq R^d. In each round, the algorithm submits a pair of guesses x1 and x2, and the adversary responds with any point in the space that is at least as good as both guesses. The cost of each query is the suboptimality of the worst of the two guesses; i.e., max(f(x1) − f(x*),f(x2) − f(x*)). The goal is to minimize the number of iterations required to find an ε-optimal point and to minimize the total cost (regret) of the guesses over many rounds. Our main result is an efficient randomized algorithm for several natural choices of the function f and set X that incurs cost O(d) and iteration complexity O(d log(1/ε)^2). Moreover, our dependence on d is asymptotically optimal, as we show examples in which any randomized algorithm for this problem must incur Ω(d) cost and iteration complexity.
more » « less
Full Text Available
When is Agnostic Reinforcement Learning Statistically Tractable?

Jia, Zeyu; Li, Gene; Rakhlin, Alexander; Sekhari, Ayush; Srebro, Nati (January 2024, Advances in Neural Information Processing Systems 36)

Full Text Available
BaM-seq and TBaM-seq, highly multiplexed and targeted RNA-seq protocols for rapid, low-cost library generation from bacterial samples

https://doi.org/10.1093/nargab/lqad017

Johnson, Grace_E; Parker, Darren_J; Lalanne, Jean-Benoit; Parker, Mirae_L; Li, Gene-Wei (March 2023, NAR Genomics and Bioinformatics)

Abstract The ability to profile transcriptomes and characterize global gene expression changes has been greatly enabled by the development of RNA sequencing technologies (RNA-seq). However, the process of generating sequencing-compatible cDNA libraries from RNA samples can be time-consuming and expensive, especially for bacterial mRNAs which lack poly(A)-tails that are often used to streamline this process for eukaryotic samples. Compared to the increasing throughput and decreasing cost of sequencing, library preparation has had limited advances. Here, we describe bacterial-multiplexed-seq (BaM-seq), an approach that enables simple barcoding of many bacterial RNA samples that decreases the time and cost of library preparation. We also present targeted-bacterial-multiplexed-seq (TBaM-seq) that allows for differential expression analysis of specific gene panels with over 100-fold enrichment in read coverage. In addition, we introduce the concept of transcriptome redistribution based on TBaM-seq that dramatically reduces the required sequencing depth while still allowing for quantification of both highly and lowly abundant transcripts. These methods accurately measure gene expression changes with high technical reproducibility and agreement with gold standard, lower throughput approaches. Together, use of these library preparation protocols allows for fast, affordable generation of sequencing libraries.
more » « less
Pessimism for Offline Linear Contextual Bandits using Confidence Sets

Li, Gene; Ma, Cong; Srebro, Nati (January 2022, Advances in neural information processing systems)

Full Text Available
Understanding the Eluder Dimension

Li, Gene; Kamath, Pritish; Foster, Dylan J.; Srebro, Nati (January 2022, Advances in neural information processing systems)

Full Text Available
First-principles model of optimal translation factors stoichiometry

https://doi.org/10.7554/eLife.69222

Lalanne, Jean-Benoît; Li, Gene-Wei (September 2021, eLife)

Enzymatic pathways have evolved uniquely preferred protein expression stoichiometry in living cells, but our ability to predict the optimal abundances from basic properties remains underdeveloped. Here, we report a biophysical, first-principles model of growth optimization for core mRNA translation, a multi-enzyme system that involves proteins with a broadly conserved stoichiometry spanning two orders of magnitude. We show that predictions from maximization of ribosome usage in a parsimonious flux model constrained by proteome allocation agree with the conserved ratios of translation factors. The analytical solutions, without free parameters, provide an interpretable framework for the observed hierarchy of expression levels based on simple biophysical properties, such as diffusion constants and protein sizes. Our results provide an intuitive and quantitative understanding for the construction of a central process of life, as well as a path toward rational design of pathway-specific enzyme expression stoichiometry.
more » « less
Full Text Available

« Prev Next »

Search for: All records